Unit fusion for concatenative speech synthesis
Abstract
An important problem in concatenative synthesis is the occurrence of spectral discontinuities, or “concatenation mismatch”, between sonorant speech units. In this paper, we present an approach that reduces concatenation mismatch by combining spectral information from two sequences of speech units selected in parallel. Concatenation units define the initial spectral trajectories for a target utterance, while fusion units define the desired transitions between concatenated units. The two unit sequences are “fused” by imposing dynamic constraints, defined by the fusion units, on the spectral trajectories of the concatenation units. To regenerate the modified speech units, we use a synthesis algorithm based on sinusoidal + all-pole analysis of speech, which overcomes the limitations of residual-excited LPC. Results from a perceptual test show that our method is highly successful at removing concatenation artifacts in speech generated from an inventory of diphones.
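The abstract does not give the exact formulation of the fusion step, so the following is only a minimal sketch of one way such a constraint could be imposed, not the authors' method. It assumes a single scalar spectral parameter track, treats the "dynamic constraints" as frame-to-frame differences taken from the fusion units, and balances them against the concatenation-unit values with a weighted least-squares fit. The function name `fuse_trajectory` and the weights `w_static` and `w_dynamic` are hypothetical.

```python
def fuse_trajectory(concat, fusion_delta, w_static=1.0, w_dynamic=10.0):
    """Illustrative fusion of one parameter track (assumed formulation).

    Minimises  w_static  * sum_t (x[t] - concat[t])**2
             + w_dynamic * sum_t ((x[t+1] - x[t]) - fusion_delta[t])**2
    by solving the resulting tridiagonal normal equations.
    """
    n = len(concat)
    if len(fusion_delta) != n - 1:
        raise ValueError("fusion_delta needs one entry per frame transition")
    if w_static <= 0:
        raise ValueError("w_static must be positive for a unique solution")

    off = -w_dynamic  # constant sub- and super-diagonal entry
    diag, rhs = [], []
    for i in range(n):
        # Each frame is tied to one or two neighbours by a delta term.
        neighbours = (1 if i > 0 else 0) + (1 if i < n - 1 else 0)
        diag.append(w_static + w_dynamic * neighbours)
        prev_d = fusion_delta[i - 1] if i > 0 else 0.0
        next_d = fusion_delta[i] if i < n - 1 else 0.0
        rhs.append(w_static * concat[i] + w_dynamic * (prev_d - next_d))

    # Thomas algorithm: forward elimination ...
    for i in range(1, n):
        m = off / diag[i - 1]
        diag[i] -= m * off
        rhs[i] -= m * rhs[i - 1]
    # ... followed by back substitution.
    fused = [0.0] * n
    fused[-1] = rhs[-1] / diag[-1]
    for i in range(n - 2, -1, -1):
        fused[i] = (rhs[i] - off * fused[i + 1]) / diag[i]
    return fused


# Toy data: two flat units with a step at the join, and a fusion unit
# whose transition rises smoothly from 1.0 to 2.0 over the same frames.
concat = [1.0] * 4 + [2.0] * 4
fusion_delta = [1.0 / 7.0] * 7
fused = fuse_trajectory(concat, fusion_delta)
```

In this toy setup, raising `w_dynamic` pulls the fused track toward the fusion unit's transition shape, while setting it to zero leaves the concatenation-unit values unchanged. A real system would apply a constraint of this kind to full spectral parameter vectors before resynthesis.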
Similar resources
An artificial intelligence approach to concatenative sound synthesis
Content Overview; List of Figures; List of Tables; List of Abbreviations; Acknowledgments; Author’s Declaration. Chapter 1: Introduction (1.1 Motivation; 1.2 Introduction; 1.3 Objectives; 1.4 Thesis Structure). Chapter 2: Principles of Concatenative Sound Synthesis (2.1 Sound Synthesis; 2.1.1 Rule-based Model; 2.1.2 Data-driven Model; 2.2 Sub...
Forward Masking Phenomenon in Concatenative Speech Synthesis
The approach described in this paper aims to bring more knowledge into the design of concatenative text-to-speech systems. This knowledge is based on the masking phenomenon of the inner ear, particularly its temporal (forward) masking properties. Such a knowledge-based design is suggested for use in unit selection-based speech synthesis, currently a prominent technique in concatenative s...
Concatenative speech synthesis for European Portuguese
This paper describes our ongoing work in the area of text-to-speech synthesis, specifically on concatenative techniques. Our preliminary work consisted of investigating the current trends in concatenative synthesis and the problems that could arise when we apply the existing state-of-the-art solutions to the specific case of European Portuguese. Our ultimate goal is to develop a text-to-speech ...
Stages and methods of preparing syllable and diphone speech databases for a Persian text-to-speech system
Speech databases are part of concatenative text-to-speech synthesis systems. The phonetic quality of the databases plays a significant role in the naturalness of the synthesized speech. This paper introduces two speech databases for Persian, one syllable-based and one diphone-based, and examines how they were developed, their specifications, and their advantages relative to each other. ...
A System for Data-driven Concatenative Sound Synthesis
In speech synthesis, concatenative data-driven synthesis methods prevail. They use a database of recorded speech and a unit selection algorithm that selects the segments that match best the utterance to be synthesized. Transferring these ideas to musical sound synthesis allows a new method of high quality sound synthesis. Usual synthesis methods are based on a model of the sound signal. It is v...